Incremental Haplotype Inference, Phylogeny, and Almost Bipartite Graphs

Author

  • Peter Damaschke
Abstract

We address the combinatorial problem of inferring haplotypes in a population that forms a perfect phylogeny (PP), given a sample of genotypes. The problem is relevant because genotypes are easier to obtain by DNA sequencing than haplotypes. Because PPs appear naturally and frequently in DNA sequences of restricted length, PP haplotyping is a favourable approach to reliable haplotype inference. Since Gusfield’s seminal paper from 2002, a number of different algorithms have been proposed. Here we give an algorithm that identifies haplotypes incrementally (along the sequence). Under the random mating assumption, all sufficiently frequent haplotypes are inferred from a random genotype sample of asymptotically optimal size. Owing to its extreme simplicity, the idea of the algorithm extends easily to more general population structures. This can be beneficial because the strict PP assumption is easily violated in reality. Missing data can also be recovered by incremental haplotyping, provided they are not too prevalent. In a more graph-theoretic part of this work we solve a problem we call almost-2-coloring of graphs, which arises in an enhanced version of our haplotyping algorithm. We show that the solution space of this graph problem can be computed in


Similar resources

Haplotype Block Partitioning and tagSNP Selection under the Perfect Phylogeny Model

Single Nucleotide Polymorphisms (SNPs) are the most common form of polymorphism in the human genome. Analyses of genetic variations have revealed that individual genomes share common SNP-haplotypes. The particular pattern of these common variations forms a block-like structure on the human genome. In this work, we develop a new method based on the Perfect Phylogeny Model to identify haplo...


META-HEURISTIC ALGORITHMS FOR MINIMIZING THE NUMBER OF CROSSING OF COMPLETE GRAPHS AND COMPLETE BIPARTITE GRAPHS

The minimum crossing number problem is among the oldest and most fundamental problems arising in the area of automatic graph drawing. In this paper, eight population-based meta-heuristic algorithms are utilized to tackle the minimum crossing number problem for two special types of graphs, namely complete graphs and complete bipartite graphs. A 2-page book drawing representation is employed for ...


Balanced Degree-Magic Labelings of Complete Bipartite Graphs under Binary Operations

A graph is called supermagic if there is a labeling of edges where the edges are labeled with consecutive distinct positive integers such that the sum of the labels of all edges incident with any vertex is constant. A graph G is called degree-magic if there is a labeling of the edges by integers 1, 2, ..., |E(G)| such that the sum of the labels of the edges incident with any vertex v is equal t...


Linear Reduction for Haplotype Inference

The haplotype inference problem asks for a set of haplotypes explaining a given set of genotypes. Popular software tools for haplotype inference (e.g., PHASE, HAPLOTYPER), as well as new algorithms recently proposed for perfect phylogeny inference (DPPH), often do not scale well. When the number of sites (SNPs) reaches the thousands, these tools often cannot deliver an answer in reasonable time even if...


Toward an algebraic understanding of haplotype inference by pure parsimony.

Haplotype inference by pure parsimony (HIPP) is known to be NP-Hard. Despite this, many algorithms successfully solve HIPP instances on simulated and real data. In this paper, we explore the connection between algebraic rank and the HIPP problem, to help identify easy and hard instances of the problem. The rank of the input matrix is known to be a lower bound on the size an optimal HIPP solutio...




Publication date: 2004